Abstract

With the exponential growth of the world population and the constant increase
in human mobility, the danger of epidemic outbreaks is rising. Especially
in high-density urban areas such as public transport and transfer points, where
people come into close proximity with each other, we observe a dramatic increase
in the transmission of airborne viruses and related pathogens. It is essential to
have a good understanding of the 'transmission highways' in such areas in order
to prevent or predict the spread of infectious diseases. Our approach is to
combine as much information as possible from all relevant sources and to
integrate it in a simulation environment that allows for scenario testing
and decision support. In this paper we lay out a novel approach to studying urban
airborne disease spreading by combining traffic information with geo-spatial
data, infection dynamics and spreading characteristics.

Keywords: Geographical Information System (GIS), Multi-Agent Systems
(MAS), Infectious Diseases, Epidemics



1. Introduction 

City-level airborne epidemics are a threat to healthy living. With the
exponential growth of the world population and the constant increase in human
mobility, the danger of epidemic outbreaks is rising. For example, the
novel Influenza A (H1N1), also known as Human Swine Influenza or Swine Flu,
spread internationally from Mexico in 2009 and caused a serious epidemic in
China. China is highly susceptible to pandemic influenza A (H1N1) due to its
large population and high residential density. According to the Ministry of Health
of China, as of 30 September 2009 the provinces of mainland China had reported
19589 confirmed cases, 14348 cured cases, 10 severe cases and a few deaths
(Ministry of Health of China, 2009).

In high-density urban areas such as public transport and transfer points,
where people come into close proximity with each other, we observe a dramatic
increase in the transmission of airborne viruses and related pathogens. To model
and simulate airborne epidemics faithfully, the infrastructure of the city under
study must be modeled in detail. We use Geographic Information System (GIS)
technology to model the infrastructure of a city that might be threatened by an
epidemic. GIS is a combination of database management capabilities for collecting
and storing large amounts of geospatial data, together with spatial analysis
capabilities to investigate geospatial relationships among the entities
represented by that data, plus map display capabilities to portray the geospatial
relationships in two- and three-dimensional map form (Nyerges et al., 2009). GIS
facilitates storing, querying and visualizing city infrastructure including roads,
regions with diverse functionality, public transportation and so forth. We also
address path routing based on city transportation in order to capture
transmissions that occur at specific localities, especially on public transport.
This matters because in many developing countries such as China, overly crowded
public transportation usually escalates airborne epidemics.

On the basis of the geo-spatial information, we model the local population
that dwells in the city under study and its spatio-dynamical behavior. There is
growing recognition that the solutions to the most vexing public health problems
are likely to be those that embrace the behavioral and social sciences as key
players (Mabry et al., 2008). Human behavior plays an important role in the
spread of infectious diseases, and understanding the influence of behavior on the
spread of diseases can be key to improving control efforts (Funk et al., 2010).
It is essential to have a good understanding of the 'transmission highways' in
urban areas in order to prevent or predict the spread of infectious diseases.
Therefore, investigating the patterns that are relevant for social contacts,
and the consequent airborne virus transmissions, is of great importance.

In this study we lay out a novel approach to studying urban airborne disease
spreading by combining traffic information with geo-spatial data, infection
dynamics and spreading characteristics. We combine as much information as
possible from all relevant sources and integrate it in a simulation environment
that allows for scenario testing and decision support.

2. Model 

2.1. City Modeling 

We discuss city modeling from the aspects of city partitions and traffic (road 
and public transportation) networks. 

2.1.1. Regions and Sublocations 

In order to construct a synthetic city, we break down major metropolitan
areas into regions and the sublocations (SLs) that reside inside each region.
Regions, called land uses in some studies, are pieces of city land serving
different purposes such as agriculture, commerce, medical care and education.
Sublocations, each affiliated to a specific region, represent realistic room-like
spaces where people conduct their daily activities and have social contacts.

Each region is categorized by type, such as agriculture, residence, hospital,
school, university or recreation, according to the main facilities that it
provides to people, as discussed in (Valle et al., 2006; Del Valle et al.,
2007; Yang et al., 2007). In this study we exclusively consider 6 types of
regions: housing (HR), office (OR), school (SR), university (UR), medical
(MR) and recreational (RR). Such a region partition requires GIS files that
comprise clearly partitioned land pieces which can be mapped to
the aforementioned 6 types of region, ignoring those regions (e.g., agricultural
regions) that contribute less to the spread of diseases.

A region contains a set of sublocations of different classes. For example, a
university region (UR) contains office sublocations (offices), residential
sublocations (student dormitories and faculty members' homes), classroom
sublocations (classrooms, labs and library space), recreation sublocations
(cafeterias, clubs, shops, refectories, restaurants etc.) and possibly hospital
sublocations. Specifically, the recreational class includes shops, restaurants,
cinemas, supermarkets and all other relevant places which provide services of
recreation, relaxation or sales of life necessities. In this study we classify
sublocations as housing (HS), office (OS), classroom (CS), patient room (PS) and
recreational (RS). Table 1 lists the classifications of regions and secondary
sublocations in detail.

TABLE 1 

Additionally, each sublocation is characterized as being either indoor or
outdoor, which implies different transmission probabilities of viruses inside the
space. For many airborne viruses, outdoor conditions such as sunshine, heat, wind
and air circulation can lower the infection probability between the infected and
the susceptible.

2.1.2. The Road and Public Transportation Networks 

The traffic routes of the whole city are modeled as a road network (RN) and a
public transportation network (PTN).

The assemblage of roads in a city can be mapped to a road network (RN).
Roads, as the transport infrastructure of a city, are composed of road sections
and crossings. We build up the road network by denoting crossings by nodes and
sections by edges, as shown in Fig. 1. Each edge, either a straight line or a
polyline, stands for a head-to-tail chain of real road sections. Edges can be
1-directional or 2-directional, corresponding to one-way or two-way road
sections, respectively. A crossing joins several road sections
together. The number of sections (usually 2, 3 or 4) that a crossing links to
indicates the connectivity of the crossing. In this way, the whole city roadway
can be mapped to a complex network (see footnote 1) with a vast number of nodes
and edges. The degree of each node is the number of neighboring sections that
the corresponding crossing connects with. By construction, the degree of every
node in this road network is at least 1.
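
As a minimal sketch of this mapping, the following Python fragment stores crossings as nodes and road sections as directed edges and computes node degrees. The class name `RoadNetwork`, its methods and the toy coordinates are hypothetical and not taken from the original system; the degree is counted here as the number of distinct neighboring crossings, and every section is assumed to be a straight line.

```python
from collections import defaultdict
from math import hypot


class RoadNetwork:
    """Toy road network: crossings are nodes, road sections are edges."""

    def __init__(self):
        self.coords = {}                     # node id -> (x, y) in km
        self.out_edges = defaultdict(dict)   # node -> {reachable neighbor: length in km}
        self.neighbors = defaultdict(set)    # undirected adjacency, used for the degree

    def add_crossing(self, node, x, y):
        self.coords[node] = (x, y)

    def add_section(self, a, b, two_way=True):
        """Add a straight section from a to b; it is one-way unless two_way is set."""
        (xa, ya), (xb, yb) = self.coords[a], self.coords[b]
        length = hypot(xb - xa, yb - ya)
        self.out_edges[a][b] = length
        if two_way:
            self.out_edges[b][a] = length
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def degree(self, node):
        return len(self.neighbors[node])


rn = RoadNetwork()
for node, (x, y) in {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 4)}.items():
    rn.add_crossing(node, x, y)
rn.add_section("A", "B")
rn.add_section("B", "C")
rn.add_section("C", "D", two_way=False)   # one-way street from C to D
rn.add_section("A", "C")                  # diagonal section

degrees = {n: rn.degree(n) for n in rn.coords}
```

Keeping the directed `out_edges` separate from the undirected `neighbors` lets a routing routine respect one-way sections while the degree still reflects how many sections meet at a crossing.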

1 In the context of network theory, a complex network is a network with non-trivial
topological features that mostly do not occur in simple networks such as lattices (Newman, 2003).

FIGURE 1



The assemblage of public transportation routes can be mapped to a public
transportation network (PTN) in terms of lines and stops. Public
transportation consists of many bus/tram and metro/train lines along which
buses/trams and metros/trains operate frequently during the daytime. Buses/trams
and metros/trains depart every few minutes from the starting stop of a line and
move towards the destination stop. People get on or off at each stop along
each line. Several lines may meet at a stop, which allows transfers. Every line is
composed of line sections joined head to tail. Similar to a crossing in the
roadway, a line crossing joins several line sections together. Therefore, we
construct the PTN by denoting crossings by nodes and sections by edges, while
stops are scattered along both sides of each edge, as shown in Fig. 2. The route
of a bus line in one direction passes six bus stops (displayed as squares) No. 1-6
along three line sections, while in the other direction it passes stops No. 7-12
along the same three line sections.

FIGURE 2 

2.1.3. Travel Routing 

People travel within a city by foot, car, taxi, bike and public transportation. 
We discuss their travel routing by utilizing the previously constructed RN and 
PTN. For simplicity, we focus on the mobility of people inside the city under 
study, leaving out of consideration the relatively less common commutes between 
cities. 

For travel by means other than public transportation, we consider travel
routing in two ways. On the one hand, people move along a straight line
connecting the start-point ($P_s$) and the end-point ($P_e$) for very short
distances, say, when $\mathrm{Distance}(P_s, P_e) \le 3$ km, so that
$Path = L(P_s, P_e)$, where $L$ denotes the line between the two points. On the
other hand, people move along the shortest or feasible paths in the RN for
longer distances. Established results from network theory help to find the
shortest or feasible paths, for example Dijkstra's algorithm for
node-to-node shortest path computation. Thus, if $\mathrm{Distance}(P_s, P_e) > 3$ km,
the travel routing result is the combination of 5 parts (see Fig. 3), i.e.,

$$Path = L(P_s, N_{rs}) + L(N_{rs}, N_{ns}) + SP(N_{ns}, N_{ne}) + L(N_{ne}, N_{re}) + L(N_{re}, P_e),$$

where $N_{rs}$ is the nearest point on a road (an edge in RN) to $P_s$, $N_{ns}$
is the nearest node (in RN) to $N_{rs}$, $SP$ computes the shortest path (plotted
in grey) between two nodes in RN ($N_{ns}$ and $N_{ne}$ in this case), $N_{re}$ is
the nearest point on a road to $P_e$ and $N_{ne}$ is the nearest node to $N_{re}$.
Additionally, the $SP$ result can be substituted by other feasible paths if
traffic avoidance needs to be considered.
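
The two routing cases can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the helper names (`project`, `dijkstra`, `route`), the toy network `COORDS`/`ADJ` and the use of straight, positive-length sections are all hypothetical, and planar coordinates in km are assumed.

```python
import heapq
from math import hypot

THRESHOLD_KM = 3.0  # straight-line threshold quoted in the text


def dist(p, q):
    return hypot(p[0] - q[0], p[1] - q[1])


def project(p, a, b):
    """Nearest point to p on the segment a-b (segment length assumed > 0)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def dijkstra(adj, source, target):
    """Node-to-node shortest path; adj maps node -> {neighbor: length}."""
    best, prev, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > best.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adj.get(u, {}).items():
            if d + w < best.get(v, float("inf")):
                best[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def route(p_s, p_e, coords, adj):
    """Return the travel path as a list of (x, y) waypoints, or None."""
    if dist(p_s, p_e) <= THRESHOLD_KM:
        return [p_s, p_e]                 # Path = L(P_s, P_e)

    def snap(p):
        # nearest point on any road section, then the nearest node to that point
        sections = [(a, b) for a in adj for b in adj[a]]
        on_road = min((project(p, coords[a], coords[b]) for a, b in sections),
                      key=lambda q: dist(p, q))
        node = min(coords, key=lambda n: dist(on_road, coords[n]))
        return on_road, node

    n_rs, n_ns = snap(p_s)
    n_re, n_ne = snap(p_e)
    nodes = dijkstra(adj, n_ns, n_ne)     # SP(N_ns, N_ne)
    if nodes is None:
        return None
    return [p_s, n_rs] + [coords[n] for n in nodes] + [n_re, p_e]


COORDS = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (4.0, 3.0)}
ADJ = {"A": {"B": 4.0}, "B": {"A": 4.0, "C": 3.0}, "C": {"B": 3.0}}
path = route((0.5, 0.5), (4.5, 2.5), COORDS, ADJ)
```

In this toy case the path leaves the start-point for the nearest road point, walks to node A, follows the shortest node path A-B-C, and then reaches the end-point via the nearest road point, which mirrors the five parts of the formula. Replacing `dijkstra` with another path finder would cover the traffic-avoidance variant.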

FIGURE 3 

Traveling by public transportation further complicates routing, so we
design a breadth-first algorithm as illustrated in Fig. 4. Looking for the shortest
paths in the PTN does not necessarily solve the routing problem because
buses/metros move along predefined stop-by-stop lines instead of shortest paths.
We start with $P_s$ and $P_e$, i.e., the start-point and the end-point. In order
to compute paths based on public transportation, a set of stops close to $P_s$,
denoted by $S_1$, needs to be obtained first. We search for these stops within a
given distance of $P_s$, and the resulting circle with center point $P_s$ and the
given radius is called the extension area of $P_s$. We define the extension
operation $Ex(stop, radius)$ as obtaining all the stops inside the circle with
center point $stop$ and the given radius. Thus $S_1 = Ex(P_s, radius)$. In the
same way we can get $E_1$, a set of stops close to $P_e$, using the corresponding
extension operation $Ex(P_e, radius)$. The two sets $S_1$ and $E_1$ represent the
boarding stops and alighting stops that people can choose when traveling from
location $P_s$ to $P_e$ by public transportation.

The search operation continues iteratively. We denote the resulting set of
stops from the $i$th search by $S_i$ and the radius adopted for the $i$th search
by $radius_i$. In order to search onward according to the predefined line
information, we define the operation of obtaining the directly reachable stops
of a given stop as $DR(stop)$, which produces the set of stops that come right
after the given stop on all lines passing through it. As shown in Fig. 4, for
each stop in $S_1$ (three in total, excluding $P_s$), we can get its $DR(stop)$
set. As an illustration, in Fig. 4 each stop in $S_1$ is assumed to have only one
directly reachable stop. Thus, the union of the stops inside the three extension
areas (three big circles with dashed perimeter) leads to

$$S_2 = \bigcup_{stop_2 \in DRS} Ex(stop_2, radius_2), \quad \text{where} \quad DRS = \bigcup_{stop_1 \in S_1} DR(stop_1).$$

The search results starting from $P_s$ and from $P_e$ are given in Eq. 1 and
Eq. 2, respectively.

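$$S_i = \begin{cases} Ex(P_s, radius_i) & i = 1 \\ \displaystyle\bigcup_{stop_1 \in S_{i-1}} \; \bigcup_{stop_2 \in DR(stop_1)} Ex(stop_2, radius_i) & i \ge 2 \end{cases} \tag{1}$$

$$E_i = \begin{cases} Ex(P_e, radius_i) & i = 1 \\ \displaystyle\bigcup_{stop_1 \in E_{i-1}} \; \bigcup_{stop_2 \in DR(stop_1)} Ex(stop_2, radius_i) & i \ge 2 \end{cases} \tag{2}$$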

The search terminates when $\exists i, j$ such that $S_i \cap E_j \neq \emptyset$.
Giving priority first to the minimum number of transfers and second to the least
travel time, we end up with the optimal routing from $P_s$ to $P_e$, involving all
the stops on the way and the necessary transfers. Please note that the radius
parameter for the $i$th search, $radius_i$, is tunable. For example, we can set
$radius_1 = 1$ km to search for stops within a radius of 1 km from $P_s$ or $P_e$,
and 0.05 km for the rest, implying that transfers between lines can only occur
when the distance between 2 nearby stops is less than 0.05 km. If the routing
fails, which means that the algorithm ends up with
$S_i \cap E_j = \emptyset, \forall i, j$, people need to resort to other means of
travel, e.g., taxi. Arguably, in a maturely developed city, the possibility of
routing failure is low and acceptable for our simulations.
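
The iterative extension search can be illustrated with a short Python sketch. It is deliberately simplified and makes several assumptions that are not in the original: only the forward sets $S_i$ are expanded and they are compared against $E_1$ alone, the toy `STOPS` and `LINES` data are invented, and the function names `ex`, `dr` and `forward_search` are hypothetical. It returns the iteration at which the sets first meet, not the full itinerary.

```python
from math import hypot

# Hypothetical toy data: stop id -> (x, y) in km, and lines as ordered stop lists.
STOPS = {"a1": (0.0, 0.0), "a2": (2.0, 0.0), "a3": (4.0, 0.0),
         "b1": (4.0, 0.03), "b2": (4.0, 2.0), "b3": (4.0, 4.0)}
LINES = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2", "b3"]}


def ex(center, radius):
    """Ex(center, radius): all stops inside the circle around center."""
    cx, cy = center
    return {s for s, (x, y) in STOPS.items() if hypot(x - cx, y - cy) <= radius}


def dr(stop):
    """DR(stop): stops that come right after `stop` on every line through it."""
    nxt = set()
    for seq in LINES.values():
        for i, s in enumerate(seq[:-1]):
            if s == stop:
                nxt.add(seq[i + 1])
    return nxt


def forward_search(p_s, p_e, first_radius=1.0, transfer_radius=0.05, max_iter=50):
    """Expand S_1, S_2, ... until some S_i meets E_1; return (i, common stops)."""
    e_1 = ex(p_e, first_radius)                 # E_1 = Ex(P_e, radius_1)
    s_i = ex(p_s, first_radius)                 # S_1 = Ex(P_s, radius_1)
    for i in range(1, max_iter + 1):
        if s_i & e_1:
            return i, s_i & e_1
        # next set: union of Ex(stop2, radius) over stop2 in DR(stop1), stop1 in S_i
        s_next = set()
        for stop1 in s_i:
            for stop2 in dr(stop1):
                s_next |= ex(STOPS[stop2], transfer_radius)
        if not s_next:
            break                               # nothing left to expand
        s_i = s_next
    return None                                 # routing failed


result = forward_search(p_s=(0.2, 0.1), p_e=(4.1, 3.8))
```

With the toy data, the search rides line A to its last stop, transfers to line B because stops `a3` and `b1` lie within the 0.05 km transfer radius, and meets $E_1$ at stop `b3`. A `None` result corresponds to the routing-failure case in which travelers fall back on other means such as a taxi.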

FIGURE 4 

2.2. The Synthetic Population 

In this study, each person is represented as an agent with its own attributes
and behavior, based on Multi-Agent Systems theory, which has proven to
be suitable for modeling epidemics (Epstein and Axtell, 1996; Reynolds and
Dixon, 2000; F., 2001; Koopman, 2006; Auchincloss and Diez Roux, 2008). In
order to synthesize the population of a city under study and investigate how
people transmit airborne viruses and respond to epidemics, we need to outline
the attributes of people with respect to epidemiological and sociological
characteristics, and model people's daily behavior, the occurrence of contacts
and subsequent infections.

2.2.1. Attributes and Classification of People 

The selection of attributes of people is determined by the virological,
medical, sociological and demographic data that is available to support our
models. In general, we consider age, gender, susceptibility to epidemics,
immunity to certain viruses, social status, infection status (susceptible,
infected, infectious, treated or cured), Housing SL, Office SL etc. The Housing
SL indicates the place where a person rests, especially at night, including homes
and dormitories; in contrast, the Office SL indicates the place where a person
spends time working (for adults) or studying (for students) during the daytime.
Please note that the study activities of students are also regarded as work
activities. The initial assignment of values to these attributes for each person
depends on the actual statistical distribution or on rules deduced from available
data. For example, the age distribution can be obtained from the national census;
the distribution of the distance between one's Housing SL and Office SL for people
living in a given city can be estimated based on questionnaires. Setting one's
Housing SL and Office SL follows this procedure: (1) each person is initially
attached to a Housing SL, taking into account household formation rules regarding
age, gender, family size etc.; (2) a distance value between the Housing SL and
the Office SL is drawn from a certain distribution; (3) an Office SL is randomly
selected on the city map at approximately that distance from
the Housing SL. Once set, some attributes are kept constant throughout
simulations, while others can change over simulation time. For instance, the
infection status of a person can be set to susceptible (healthy) at the beginning,
and then change in response to the occurrence of certain events such as infections.
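
Steps (2) and (3) of this procedure might look as follows in Python. The sketch is illustrative only: the function `assign_office`, the tolerance band of 0.5 km, the fallback to the closest match and the toy coordinates are assumptions that do not come from the original study, and the fixed draw of 5 km stands in for a distance distribution estimated from questionnaires.

```python
import random
from math import hypot


def assign_office(housing_xy, office_sls, draw_distance, rng, tolerance_km=0.5):
    """Pick an Office SL roughly draw_distance() km away from the Housing SL."""
    target = draw_distance()                        # step (2): draw a distance

    def gap(sl):
        x, y = office_sls[sl]
        return abs(hypot(x - housing_xy[0], y - housing_xy[1]) - target)

    candidates = sorted(sl for sl in office_sls if gap(sl) <= tolerance_km)
    if candidates:
        return rng.choice(candidates)               # step (3): random pick in the band
    return min(office_sls, key=gap)                 # assumed fallback: closest match


# Hypothetical Office SL coordinates in km, with the Housing SL at the origin.
OFFICE_SLS = {"O1": (1.0, 0.0), "O2": (5.0, 0.0), "O3": (0.0, 5.2), "O4": (10.0, 0.0)}
office = assign_office((0.0, 0.0), OFFICE_SLS, lambda: 5.0, random.Random(0))
```

In a full model, `draw_distance` would sample from the empirical commuting-distance distribution, for example `lambda: rng.lognormvariate(mu, s)` with parameters fitted to survey data.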

All individuals are classified according to age structure and lifestyle, as
given in Table 2. The subsequent modeling of people's activities (in Sec. 2.2.2)
is applied to these 5 classes because it is believed that daily activity patterns
are related to individuals' socioeconomic characteristics such as household role,
lifestyle and life cycle (Kulkarni and McNally, 2000; Yang et al., 2007).

TABLE 2 

2.2.2. Daily Agenda 

People conduct diverse daily activities in sublocations. The daily activities
consist of working, staying at home, relaxing at different recreational places,
staying in hospital after getting infected, etc. People engage in each activity in
a specific class of sublocation (the classification is given in Table 1). Specifically,
people work in Office SLs, stay home in Housing SLs, shop, do sports or enjoy
entertainment in Recreational SLs, get treatment in Patient Room SLs and
study in Classroom SLs. The types of activities, the corresponding places where
these activities take place and the people classes involved are listed in Table 3,
based on research reported in (Valle et al., 2006; Yang et al., 2007). Additionally,
the reader is referred to Table 1 in (Del Valle et al., 2007) for the distribution
of average duration by activity type. For instance, the average duration of the Home
activity is 12 h 24 min with a standard deviation of 5 h 8 min, and the average
duration of the Work activity is 3 h 4 min with a standard deviation of 2 h 29 min.

TABLE 3 

The generation of the daily agenda of an individual is subject to daily activity
patterns, depending on the class to which this individual belongs.
For the majority group of people, whose main activity is working, we
generate the daily agenda according to the research in (R and K, 2000; J
et al., 2008; Min et al., 2008). 6 activity patterns are adopted as shown in Table
4, where "*" stands for possible activities other than Home and Work. The
percentages are taken from (Min et al., 2008), based on a survey conducted
in the Chinese city of Shangyu. In simulations, a Recreation or Medical Care
activity is generated to substitute "*".

TABLE 4
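
A hedged sketch of agenda generation from these patterns is given below. The pattern strings and percentages are the ones listed in Table 4; the function name `generate_agenda`, the probability `p_medical` of substituting Medical Care for "*" and the fixed random seed are hypothetical choices made only for illustration.

```python
import random

# Activity patterns and percentages as listed in Table 4 (Min et al., 2008).
PATTERNS = {"HWH": 53.4, "HWH*H": 10.3, "HW*WH": 2.7, "HWHWH": 27.1, "HWHWH*H": 6.5}
ACTIVITY = {"H": "Home", "W": "Work", "R": "Recreation", "M": "Medical Care"}


def generate_agenda(rng, p_medical=0.1):
    """Draw one pattern by its percentage and replace each '*' by R or M."""
    pattern = rng.choices(list(PATTERNS), weights=list(PATTERNS.values()), k=1)[0]
    agenda = []
    for symbol in pattern:
        if symbol == "*":
            # p_medical is an assumed tuning parameter, not a value from the paper
            symbol = "M" if rng.random() < p_medical else "R"
        agenda.append(ACTIVITY[symbol])
    return agenda


rng = random.Random(42)
agendas = [generate_agenda(rng) for _ in range(1000)]
```

Raising `p_medical`, or reweighting `PATTERNS`, is one way in which the generation parameters could be tuned to reflect changed behavior during an epidemic.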



Once an individual has finished an activity, he/she moves by either public or
personal transportation from the current location to the sublocation where
the next activity is going to take place. The travel routing based on either RN
or PTN can be computed according to Sec. 2.1.3, and the required time can
also be estimated, taking into account the means of travel, the start-point and
the end-point.

Furthermore, individuals' knowledge of the global epidemic situation, such as
the alert phases issued by WHO, and of their own infection status can influence
their behavior, which in turn is reflected in the generation of the daily agenda.
For example, when a person is aware of the severe prevalence of some epidemic or
of the diagnosis of his/her own infection, he/she probably prolongs his/her stay
at home, decreases work time and avoids crowded places such as recreational SLs.
Therefore, the parameters for generating activities can be tuned to adapt to
different situations.

2.2.3. Infections Due to Contacts 

People encounter and have contact with others when conducting activities
or traveling, so they may get infected with airborne viruses when epidemics
are prevalent. Let $\sigma$ be the mean number of transmission events per hour of
contact between fully infectious and fully susceptible people. For events that
occur randomly in time, the number of occurrences in a period of time of length
$t$ obeys a Poisson probability law with parameter $\sigma t$. Thus, the
probability of no occurrences in time interval $t$ is $e^{-\sigma t}$ and the
probability of at least one occurrence is $1 - e^{-\sigma t}$. When an infectious
individual $i$ and a susceptible individual $j$ stay within a distance threshold
$D^*$ (tunable for epidemics) of each other in the same sublocation for a period
of time $T_{ij}$ (recordable in simulations), infection can occur with a
probability of $P_{ij} = 1 - e^{-\sigma T_{ij}}$. According to (Chowell et al.,
2007; Del Valle et al., 2007), $\sigma$ can be estimated based on knowledge
of past epidemics. For simplicity, each individual is assumed to wander around
inside a sublocation during the stay, and his/her exact coordinates are
obtainable. Thus the distance between two individuals that stay in the same
sublocation can be measured to assess whether it is less than $D^*$.
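
The contact rule can be written down in a few lines of Python. The formula is the one given in the text; the function names, the example values $\sigma = 0.2$ per hour and $D^* = 2$ (in the same length unit as the coordinates) and the seeded random generator are assumptions made for illustration only.

```python
import math
import random


def infection_probability(sigma, hours):
    """P = 1 - exp(-sigma * T): at least one transmission event in T hours."""
    return 1.0 - math.exp(-sigma * hours)


def try_infect(pos_i, pos_j, hours, sigma, d_star, rng):
    """Return True if susceptible j is infected by infectious i during the stay."""
    if math.dist(pos_i, pos_j) >= d_star:
        return False                      # not within the distance threshold D*
    return rng.random() < infection_probability(sigma, hours)


# Example values below are hypothetical, not estimates from the literature.
SIGMA = 0.2     # mean transmission events per hour of contact
D_STAR = 2.0    # distance threshold, same unit as the coordinates

p_two_hours = infection_probability(SIGMA, 2.0)
infected = try_infect((0.0, 0.0), (1.0, 1.0), 2.0, SIGMA, D_STAR, random.Random(7))
```

For these example values, two hours of contact give $P_{ij} = 1 - e^{-0.4} \approx 0.33$; doubling the contact time does not double the probability, because the exponential saturates towards 1.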

2.2.4. Disease Progression

Disease progression can be described simply by the stages of (1) incubation with
assumed non-infectiousness, (2) a symptomatic period with infectiousness and
(3) recovery/death (Mei et al., 2010a). For some airborne diseases such as
influenza, a susceptible individual can avoid infection through vaccination.
Individuals can thus become immunized either by natural immunization
(recovery from a previous infection) or by vaccination. Therefore 3 parameters,
$D_{incubation}$, $D_{symptomatic}$ and $D_{vaccination}$, are introduced to our
model, indicating the duration of the incubation stage, the duration of the
symptomatic stage and the time that vaccination needs to stimulate immunity,
respectively. For example, we can set $D_{incubation}$ = 1-2 days,
$D_{symptomatic}$ = 1-7 days and $D_{vaccination}$ = 7-21 days for influenza A
(H1N1) (Mei et al., 2010b).
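
The stage sequence can be sketched as a small state machine in Python. The duration ranges are the influenza A (H1N1) values quoted above; everything else is an assumption for illustration: the class and attribute names are hypothetical, durations are drawn uniformly as whole days, and death and vaccination are omitted to keep the sketch short.

```python
import random
from enum import Enum


class Stage(Enum):
    SUSCEPTIBLE = "susceptible"
    INCUBATING = "incubating"      # infected, assumed not yet infectious
    SYMPTOMATIC = "symptomatic"    # infectious
    RECOVERED = "recovered"        # immune after recovery


# Duration ranges in days for influenza A (H1N1), as quoted in the text.
D_INCUBATION = (1, 2)
D_SYMPTOMATIC = (1, 7)


class Person:
    def __init__(self, rng):
        self.rng = rng
        self.stage = Stage.SUSCEPTIBLE
        self.days_left = 0

    def infect(self):
        if self.stage is Stage.SUSCEPTIBLE:
            self.stage = Stage.INCUBATING
            self.days_left = self.rng.randint(*D_INCUBATION)

    def step_day(self):
        """Advance the disease clock by one simulated day."""
        if self.stage not in (Stage.INCUBATING, Stage.SYMPTOMATIC):
            return
        self.days_left -= 1
        if self.days_left > 0:
            return
        if self.stage is Stage.INCUBATING:
            self.stage = Stage.SYMPTOMATIC
            self.days_left = self.rng.randint(*D_SYMPTOMATIC)
        else:
            self.stage = Stage.RECOVERED


person = Person(random.Random(1))
person.infect()
history = []
while person.stage is not Stage.RECOVERED:
    history.append(person.stage)
    person.step_day()
```

Each entry in `history` is the stage held on one simulated day, so its length is the total time from infection to recovery, between 2 and 9 days under the assumed ranges.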

3. GIS-based Implementation and Visualization 

Based on available GIS data, we implemented the model described above in
a simulation environment that allows for scenario testing and decision
support. Fig. 5 displays the visualization of a real city. Roads are
shown as lines, bus/metro stops as squares, buses/metros as stars, and
persons as circles. All these objects are stored, queried and manipulated
efficiently based on GIS. When simulations are running, stars and circles move on
the 2-dimensional map, representing buses/metros operating along lines and
people conducting daily activities according to their own agendas. Thus, we can
record infection events and make statistical analyses of the spreading of airborne
infectious diseases, e.g., locating where the first infections take place,
determining in which kind of sublocation (offices, buses etc.) most infections
occur, and illustrating the spreading situation (total patients, infection rate
etc.) in each city region.

FIGURE 5
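
The kind of statistics mentioned above can be derived from a simple event log, as the following Python sketch suggests. The log format, the region identifiers and the sample events are entirely hypothetical and serve only to show the aggregation step.

```python
from collections import Counter

# Hypothetical infection log: (simulation hour, region id, sublocation class).
events = [
    (9, "R12", "bus"), (10, "R12", "office"), (10, "R07", "bus"),
    (13, "R07", "recreational"), (18, "R12", "bus"),
]

first_event = min(events, key=lambda e: e[0])            # where the first infection occurs
by_sublocation = Counter(sl for _, _, sl in events)      # infections per sublocation class
by_region = Counter(region for _, region, _ in events)   # infections per city region
most_common_sl, count = by_sublocation.most_common(1)[0]
```

Dividing the per-region counts by the regional population would give the infection rate per region.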



4. Conclusion 

We have developed a novel system that integrates the most relevant geo- 
spatial and dynamical information required to assess the potential outbreak 
of airborne diseases in an urban environment. We combine GIS data with 
traffic and mobility patterns as well as knowledge on behavioral aspects. The 
information is represented in a dynamical multi-agent simulation system. The 
system allows for interactively exploring various alternative scenarios to support
decision making on the prevention and prediction of, and recovery from, an outbreak.

In the future, our study will be focused on the parallelization and distribu- 
tion of the system to support large-scale simulations. Although the system is 
currently being used to understand in retrospect the outbreak of influenza in a 
few selected densely populated small cities in China, the challenge of execution 
performance holds back the application of the system to simulating airborne 
infectious diseases in large cities with a population of, say, millions. 
Table 1: Regions and Sublocations

| Region Type | Secondary Sublocation (SL) Classes |
|---|---|
| Housing Region | Housing SLs: houses, apartments...; Office SLs: community service offices...; Recreational SLs: shops, community gardens...; Classroom SLs: children daycare... |
| Office Region | Office SLs: offices, factory workshops...; Recreational SLs: cafeteria, sport places, restaurants... |
| School Region (for elementary, middle and high schools) | Housing SLs: faculty members' households...; Office SLs: teachers' offices...; Classroom SLs: classrooms, labs and library space...; Recreational SLs: cafeteria, shops, refectories... |
| University Region | Housing SLs: student dormitories, faculty members' households...; Office SLs: teachers' offices...; Classroom SLs: classrooms, labs and library space...; Recreational SLs: cafeteria, clubs, shops, refectories... |
| Medical Region | Office SLs: doctors' offices...; Patient Room SLs: medical wards...; Recreational SLs: refectories... |
| Recreational Region | Recreational SLs: all kinds of retail, service, meal and shop SLs |

Table 2: Age structure for people classification

| Classification | Description |
|---|---|
| Children under 3 years | For children under 3 years, it is assumed that they do not have independent activities and always stay inside households. |
| Children between 3 and 18 years | Their activity patterns are assumed to be simple: go to daycare or school at school hours and stay inside households at all other times. |
| Adults between 18 and 60 years except college students | They go to work at working places during day time and stay inside households during night. They visit recreational places from time to time. |
| College students between 18 and 25 | They go to colleges or universities during day time and stay inside dormitories during night. They visit recreational places very often. |
| Adults over 60 years | They stay around households during day time and stay inside households during night. They visit recreational places less often. |



Table 3: Activity types

| Activity Types | Places | Involved People Classes |
|---|---|---|
| Work (W) | Office SLs, Classroom SLs (for students), Recreational SLs (for salesmen and waiters etc.), Patient Room SLs (for doctors and nurses) | All but children under 3 years and adults over 60 years |
| Home (H) | Housing SLs | All 5 classes |
| Medical Care (M) | Patient Room SLs (for patients) | All 5 classes |
| Recreation (R) | Recreational SLs (for consumers) | The latter 3 classes |

Table 4: Activity Patterns

| Activity Patterns | Percentages (%) |
|---|---|
| HWH | 53.4 |
| HWH*H | 10.3 |
| HW*WH | 2.7 |
| HWHWH | 27.1 |
| HWHWH*H | 6.5 |




Figure 1: Road Network

Figure 4: Travel Routing by Public Transportation

Figure 5: City Visualization